Combining Lazy Learning, Racing and Subsampling for Effective Feature Selection
Authors
Abstract
This paper presents a wrapper method for feature selection that combines Lazy Learning, racing, and subsampling techniques. Lazy Learning (LL) is a local learning technique that, once a query is received, produces a prediction by locally interpolating the neighboring examples of the query that are considered relevant according to a distance measure. Local learning techniques are often criticized for their limitations in dealing with problems that have a high number of features and large samples. Similarly, wrapper methods are considered prohibitively expensive for a large number of features, owing to the high cost of the evaluation step. The paper aims to show that a wrapper feature selection method based on LL can take advantage of two effective strategies: racing and subsampling. While the idea of racing was already proposed by Maron and Moore, this paper goes a step further by (i) proposing a multiple testing technique for less conservative racing and (ii) combining racing with subsampling techniques.
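The abstract outlines the core loop of the method: candidate feature subsets are evaluated with a lazy learner on successive sub-samples of the data, and clearly inferior candidates are eliminated early by a statistical racing test. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a regression setting, uses scikit-learn's k-nearest-neighbor regressor as a stand-in for the Lazy Learning model, and replaces the paper's less conservative multiple-testing procedure with a simple paired t-test; the function name race_feature_subsets and all parameters are illustrative.

```python
import numpy as np
from scipy.stats import ttest_rel
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_score


def race_feature_subsets(X, y, candidates, n_rounds=10, subsample=200,
                         alpha=0.05, k=5, seed=0):
    """Race candidate feature subsets; return the survivors after n_rounds."""
    rng = np.random.default_rng(seed)
    alive = [tuple(c) for c in candidates]   # subsets still in the race
    errors = {c: [] for c in alive}          # per-round error history

    for _ in range(n_rounds):
        # Subsampling: each round is scored on a fresh random sub-sample.
        idx = rng.choice(len(X), size=min(subsample, len(X)), replace=False)
        Xs, ys = X[idx], y[idx]

        # Lazy Learning stand-in: k-NN regressor scored by cross-validation.
        for c in alive:
            knn = KNeighborsRegressor(n_neighbors=k)
            mse = -cross_val_score(knn, Xs[:, list(c)], ys,
                                   scoring="neg_mean_squared_error",
                                   cv=5).mean()
            errors[c].append(mse)

        # Racing: drop subsets significantly worse than the current best.
        means = {c: float(np.mean(errors[c])) for c in alive}
        best = min(alive, key=means.get)
        survivors = []
        for c in alive:
            if c == best or len(errors[c]) < 2:
                survivors.append(c)
                continue
            _, p = ttest_rel(errors[c], errors[best])
            if not (means[c] > means[best] and p < alpha):
                survivors.append(c)
        alive = survivors

    return alive
```

Because each round is scored on a fresh sub-sample and losing subsets stop being evaluated, the cost of the evaluation step shrinks as the race proceeds, which is the efficiency argument the abstract makes for combining racing with subsampling in a wrapper method.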
Related papers
The Racing Algorithm: Model Selection for Lazy Learners (Oded Maron and Andrew W. Moore)
Given a set of models and some training data, we would like to find the model that best describes the data. Finding the model with the lowest generalization error is a computationally expensive process, especially if the number of testing points is high or if the number of models is large. Optimization techniques such as hill climbing or genetic algorithms are helpful but can end up with a model ...
A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...
Diversity in Ensemble Feature Selection
Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. Ensembles allow us to achieve higher accuracy, which is often not achievable with single models. It was shown theoretically and experimentally that in order for an ensemble to be effective, it should consist of high-accuracy base classifiers that should have high diversity in their pred...
Bridging the semantic gap for software effort estimation by hierarchical feature selection techniques
Software project management is one of the significant activities in the software development process. Software Development Effort Estimation (SDEE) is a challenging task in software project management. SDEE is an old activity in the computer industry, dating from the 1940s, and has been reviewed several times. An SDEE model is appropriate if it provides accuracy and confidence simultaneously before softwa...
Mental Arithmetic Task Recognition Using Effective Connectivity and Hierarchical Feature Selection From EEG Signals
Introduction: Mental arithmetic analysis based on Electroencephalogram (EEG) signal for monitoring the state of the user’s brain functioning can be helpful for understanding some psychological disorders such as attention deficit hyperactivity disorder, autism spectrum disorder, or dyscalculia where the difficulty in learning or understanding the arithmetic exists. Most mental arithmetic recogni...
Journal:
Volume, Issue:
Pages: -
Publication date: 2004